Prospects for quantum advantage in machine learning from the representability of functions
Masot-Llima, Sergi, Gil-Fuster, Elies, Bravo-Prieto, Carlos, Eisert, Jens, Guaita, Tommaso
Quantum machine learning (QML) is recognized as a promising approach to harness quantum computing for learning tasks [1-3]. As with all quantum algorithms, a central question is whether QML holds potential for quantum advantage [4-7] over classical computing. The counter-narrative to quantum advantage is dequantization, where, upon close inspection, certain quantum algorithms yield no benefit over their classical counterparts, because one can classically solve the task at hand. Dequantization of quantum algorithms for machine learning, in particular, has seen a surge of interest in recent years, leaving few claims of quantum advantage unchallenged [8-12]. While QML models for classical data can be studied from several perspectives, significant theoretical developments have emerged from investigating the function families that parametrized quantum circuits (PQCs) can give rise to [8, 10, 13-16]. Characterizing the functional forms arising from PQCs allows us to delineate the boundaries of quantum learning and guide the search for advantage.
Feature Selection and Regularization in Multi-Class Classification: An Empirical Study of One-vs-Rest Logistic Regression with Gradient Descent Optimization and L1 Sparsity Constraints
Arafat, Jahidul, Tasmin, Fariha, Poudel, Sanjaya
Multi-class wine classification presents fundamental trade-offs between model accuracy, feature dimensionality, and interpretability - critical factors for production deployment in analytical chemistry. This paper presents a comprehensive empirical study of One-vs-Rest logistic regression on the UCI Wine dataset (178 samples, 3 cultivars, 13 chemical features), comparing a from-scratch gradient descent implementation against scikit-learn's optimized solvers and quantifying L1 regularization effects on feature sparsity. Manual gradient descent achieves 92.59 percent mean test accuracy with smooth convergence, validating theoretical foundations, though scikit-learn provides a 24x training speedup and 98.15 percent accuracy. Class-specific analysis reveals distinct chemical signatures with heterogeneous patterns where color intensity varies dramatically (0.31 to 16.50) across cultivars. L1 regularization produces 54-69 percent feature reduction with only a 4.63 percent accuracy decrease, demonstrating favorable interpretability-performance trade-offs. We propose an optimal 5-feature subset achieving 62 percent complexity reduction with an estimated 92-94 percent accuracy, enabling cost-effective deployment with 80 dollars savings per sample and 56 percent time reduction. Statistical validation confirms robust generalization with sub-2ms prediction latency suitable for real-time quality control. Our findings provide actionable guidelines for practitioners balancing comprehensive chemical analysis against targeted feature measurement in resource-constrained environments.
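The core technique here — One-vs-Rest logistic regression trained by plain gradient descent with an L1 penalty — can be sketched in a few lines of NumPy. This is a minimal illustration of the general approach, not the authors' implementation; hyperparameters (`lr`, `l1`, `epochs`) are placeholder values.

```python
import numpy as np

def train_ovr_logistic(X, y, n_classes, lr=0.1, l1=0.0, epochs=500):
    """One-vs-Rest logistic regression trained by plain gradient descent.

    For each class c, fit a binary classifier (class c vs. the rest), with
    an optional L1 subgradient penalty that drives small weights to zero.
    """
    n, d = X.shape
    W = np.zeros((n_classes, d))  # one weight row per class
    b = np.zeros(n_classes)
    for c in range(n_classes):
        t = (y == c).astype(float)  # binary target for this class
        for _ in range(epochs):
            z = X @ W[c] + b[c]
            p = 1.0 / (1.0 + np.exp(-z))  # sigmoid
            grad_w = X.T @ (p - t) / n + l1 * np.sign(W[c])
            grad_b = np.mean(p - t)
            W[c] -= lr * grad_w
            b[c] -= lr * grad_b
    return W, b

def predict_ovr(X, W, b):
    # Pick the class whose binary scorer is most confident.
    return np.argmax(X @ W.T + b, axis=1)
```

Setting `l1 > 0` shrinks uninformative weights toward zero, which is the sparsity effect the study quantifies.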
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Alabama (0.04)
- Oceania > Australia (0.04)
- (4 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.95)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.86)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.45)
Peptidomic-Based Prediction Model for Coronary Heart Disease Using a Multilayer Perceptron Neural Network
Coronary heart disease (CHD) is a leading cause of death worldwide and contributes significantly to annual healthcare expenditures. To develop a non-invasive diagnostic approach, we designed a model based on a multilayer perceptron (MLP) neural network, trained on 50 key urinary peptide biomarkers selected via genetic algorithms. Treatment and control groups, each comprising 345 individuals, were balanced using the Synthetic Minority Over-sampling Technique (SMOTE). The neural network was trained using a stratified validation strategy. Using a network with three hidden layers of 60 neurons each and an output layer of two neurons, the model achieved a precision, sensitivity, and specificity of 95.67 percent, with an F1-score of 0.9565. The area under the ROC curve (AUC) reached 0.9748 for both classes, while the Matthews correlation coefficient (MCC) and Cohen's kappa coefficient were 0.9134 and 0.9131, respectively, demonstrating its reliability in detecting CHD. These results indicate that the model provides a highly accurate and robust non-invasive diagnostic tool for coronary heart disease.
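The abstract reports precision, sensitivity, specificity, F1, MCC, and Cohen's kappa; as a reference, these can all be derived from a binary confusion matrix. The following is a generic metrics sketch (standard formulas, not code from the study):

```python
import math

def binary_metrics(tp, fp, fn, tn):
    """Standard classification metrics from a binary confusion matrix."""
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)   # recall / true-positive rate
    specificity = tn / (tn + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    # Matthews correlation coefficient.
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Cohen's kappa: observed agreement corrected for chance agreement.
    n = tp + fp + fn + tn
    po = (tp + tn) / n
    pe = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    kappa = (po - pe) / (1 - pe)
    return dict(precision=precision, sensitivity=sensitivity,
                specificity=specificity, f1=f1, mcc=mcc, kappa=kappa)
```

MCC and kappa are reported alongside accuracy because, unlike accuracy, they remain informative when the two classes are imbalanced.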
- Oceania > Australia (0.04)
- North America > United States > Michigan (0.04)
- North America > Mexico (0.04)
- (3 more...)
- Research Report > Experimental Study (0.69)
- Research Report > New Finding (0.46)
Will AI Take My Job? Evolving Perceptions of Automation and Labor Risk in Latin America
Cremaschi, Andrea, Lee, Dae-Jin, Leonelli, Manuele
As artificial intelligence and robotics increasingly reshape the global labor market, understanding public perceptions of these technologies becomes critical. We examine how these perceptions have evolved across Latin America, using survey data from the 2017, 2018, 2020, and 2023 waves of the Lati-nobar ometro. Drawing on responses from over 48,000 individuals across 16 countries, we analyze fear of job loss due to artificial intelligence and robotics. Using statistical modeling and latent class analysis, we identify key structural and ideological predictors of concern, with education level and political orientation emerging as the most consistent drivers. Our findings reveal substantial temporal and cross-country variation, with a notable peak in fear during 2018 and distinct attitudinal profiles emerging from latent segmentation. These results offer new insights into the social and structural dimensions of AI anxiety in emerging economies and contribute to a broader understanding of public attitudes toward automation beyond the Global North.
- North America > Central America (0.61)
- South America > Brazil (0.05)
- South America > Paraguay (0.04)
- (23 more...)
- Research Report > New Finding (1.00)
- Questionnaire & Opinion Survey (1.00)
- Research Report > Experimental Study (0.94)
- Banking & Finance > Economy (0.91)
- Law (0.69)
- Education (0.68)
- Information Technology > Artificial Intelligence > Robots (1.00)
- Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
An Explainable and Interpretable Composite Indicator Based on Decision Rules
Corrente, Salvatore, Greco, Salvatore, Słowiński, Roman, Zappalà, Silvano
Composite indicators are widely used to score or classify units evaluated on multiple criteria. Their construction involves aggregating criteria evaluations, a common practice in Multiple Criteria Decision Aiding (MCDA). In MCDA, various methods have been proposed to address key aspects of multiple criteria evaluations, such as the measurement scales of the criteria, the degree of acceptable compensation between them, and their potential interactions. However, beyond producing a final score or classification, it is essential to ensure the explainability and interpretability of results as well as the procedure's transparency. This paper proposes a method for constructing explainable and interpretable composite indicators using "if..., then..." decision rules. We consider the explainability and interpretability of composite indicators in four scenarios: (i) decision rules explain numerical scores obtained from an aggregation of numerical codes corresponding to ordinal qualifiers; (ii) an obscure numerical composite indicator classifies units into quantiles; (iii) given preference information provided by a Decision Maker in the form of classifications of some reference units, a composite indicator is constructed using decision rules; (iv) the classification of a set of units results from the application of an MCDA method and is explained by decision rules. To induce the rules from scored or classified units, we apply the Dominance-based Rough Set Approach. The resulting decision rules relate the class assignment or unit's score to threshold conditions on values of selected indicators in an intelligible way, clarifying the underlying rationale. Moreover, they serve to recommend composite indicator assessment for new units of interest.
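To make the "if..., then..." rule format concrete: rules induced from ordered classifications are typically of the "at least" type ("if indicator i ≥ threshold, then class ≥ k"). Below is a minimal, hypothetical applier for such rules — an illustration of the rule syntax only, not the Dominance-based Rough Set Approach itself, which concerns how the rules are induced.

```python
def assign_class(unit, rules, default=0):
    """Apply 'if ..., then ...' decision rules of the 'at least' type.

    Each rule is (conditions, cls): if the unit satisfies every threshold
    condition {indicator: minimum value}, it belongs at least to `cls`.
    The final class is the highest lower bound among the matching rules.
    """
    cls = default
    for conditions, rule_cls in rules:
        if all(unit.get(ind, float("-inf")) >= thr
               for ind, thr in conditions.items()):
            cls = max(cls, rule_cls)
    return cls
```

A unit's class assignment is thus fully traceable to the specific threshold conditions it satisfies, which is the sense of explainability the paper targets.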
- Asia > Laos (0.14)
- South America > Brazil (0.04)
- Africa > Senegal (0.04)
- (182 more...)
- Health & Medicine (1.00)
- Materials > Metals & Mining (0.93)
AI-Assisted Decision-Making for Clinical Assessment of Auto-Segmented Contour Quality
Wang, Biling, Maniscalco, Austen, Bai, Ti, Wang, Siqiu, Dohopolski, Michael, Lin, Mu-Han, Shen, Chenyang, Nguyen, Dan, Huang, Junzhou, Jiang, Steve, Wang, Xinlei
Purpose: This study introduces a novel Deep Learning (DL)-based quality assessment (QA) approach specifically designed for evaluating auto-generated contours (auto-contours) in auto-segmentation for radiotherapy, with a focus on Online Adaptive Radiotherapy (OART). The proposed method leverages Bayesian Ordinal Classification (BOC), combined with calibrated thresholds derived from uncertainty quantification, to deliver confident QA predictions. This approach addresses key challenges in clinical auto-segmentation QA workflows such as the absence of ground truth contours, limited availability of manually labeled data, and inherent uncertainty in AI model predictions. Methods: We developed a BOC model to classify the quality of auto-contours and quantify uncertainty. To enhance predictive reliability, we implemented a calibration step to determine optimal uncertainty thresholds that meet specific clinical accuracy requirements. The method was validated under three distinct data availability scenarios: absence of manual labels, limited manual labeling, and extensive manual labeling. We specifically tested our method for auto-segmented rectum contours in prostate cancer radiotherapy. Geometric surrogate labels were employed in the absence of manual labels, transfer learning was applied when manual labels were limited, and direct use of manual labels was performed when extensive labeling was available. Results: The BOC model demonstrated robust performance across all data availability scenarios for confident predictions, with significant accuracy gains when pre-trained with surrogate labels and fine-tuned with limited manually labeled data. Specifically, fine-tuning the pretrained model with just 30 manually labeled cases and calibrating with 34 subjects achieved an accuracy of over 90% against manual labels in the test dataset.
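The "calibrated uncertainty threshold" idea — act only on predictions whose uncertainty is below a tuned cutoff, and defer the rest to manual review — can be sketched generically with predictive entropy as the uncertainty measure. This is an assumed, simplified reading of the workflow, not the paper's BOC model:

```python
import numpy as np

def confident_predictions(probs, threshold):
    """Keep only predictions whose predictive entropy falls below a
    calibrated threshold; the rest are deferred (returned as -1).

    probs: (n, k) array of per-class probabilities from the classifier.
    threshold: entropy cutoff chosen on a calibration set so that the
        retained predictions meet a target accuracy.
    """
    eps = 1e-12
    entropy = -np.sum(probs * np.log(probs + eps), axis=1)
    preds = np.argmax(probs, axis=1)
    preds[entropy > threshold] = -1  # defer to manual review
    return preds, entropy
```

Raising the threshold accepts more cases automatically at the cost of accuracy on the accepted set; the calibration step in the study picks this trade-off to meet a clinical accuracy requirement.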
- North America > United States > Texas > Dallas County > Dallas (0.04)
- North America > United States > Texas > Tarrant County > Arlington (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
An analysis of the combination of feature selection and machine learning methods for an accurate and timely detection of lung cancer
Shahriyar, Omid, Moghaddam, Babak Nuri, Yousefi, Davoud, Mirzaei, Abbas, Hoseini, Farnaz
One of the deadliest cancers, lung cancer necessitates an early and precise diagnosis. Early identification of lung cancer is crucial because it greatly improves patients' chances of recovery. This review looks at how to diagnose lung cancer using sophisticated machine learning techniques such as Random Forest (RF) and Support Vector Machine (SVM). The Chi-squared test is one feature selection strategy that has been successfully applied to identify relevant features and enhance model performance. The findings demonstrate that these techniques can improve detection efficiency and accuracy while also assisting in runtime reduction. This study produces recommendations for further research as well as ideas to enhance diagnostic techniques. In order to improve healthcare and create automated methods for detecting lung cancer, this research is a critical first step.
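The chi-squared feature selection step mentioned above scores each non-negative feature against the class labels and keeps the highest-scoring ones. A minimal NumPy sketch of that scoring (in the style commonly used for feature selection, with observed counts as per-class feature sums and expected counts from class frequencies — an illustration, not the paper's code):

```python
import numpy as np

def chi2_scores(X, y):
    """Chi-squared statistic of each non-negative feature vs. the labels.

    Observed counts are per-class sums of the feature values; expected
    counts assume the feature is distributed according to class frequency.
    """
    classes = np.unique(y)
    observed = np.array([X[y == c].sum(axis=0) for c in classes])  # (k, d)
    class_freq = np.array([(y == c).mean() for c in classes])      # (k,)
    expected = np.outer(class_freq, X.sum(axis=0))                 # (k, d)
    return ((observed - expected) ** 2 / expected).sum(axis=0)

def select_top_k(X, y, k):
    """Indices of the k features most associated with the labels."""
    return np.argsort(chi2_scores(X, y))[::-1][:k]
```

Features whose distribution is independent of the class get a score near zero and are dropped, which is how the filter reduces runtime before training RF or SVM models.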
- Asia > Middle East > Iran > Ardabil Province > Ardabil (0.04)
- North America > United States > Michigan (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > Middle East > Iran > Tehran Province > Tehran (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.89)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.71)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)
Controlling Out-of-Domain Gaps in LLMs for Genre Classification and Generated Text Detection
Roussinov, Dmitri, Sharoff, Serge, Puchnina, Nadezhda
This study demonstrates that the modern generation of Large Language Models (LLMs, such as GPT-4) suffers from the same out-of-domain (OOD) performance gap observed in prior research on pre-trained Language Models (PLMs, such as BERT). We demonstrate this across two non-topical classification tasks: 1) genre classification and 2) generated text detection. Our results show that when demonstration examples for In-Context Learning (ICL) come from one domain (e.g., travel) and the system is tested on another domain (e.g., history), classification performance declines significantly. To address this, we introduce a method that controls which predictive indicators are used and which are excluded during classification. For the two tasks studied here, this ensures that topical features are omitted, while the model is guided to focus on stylistic rather than content-based attributes. This approach reduces the OOD gap by up to 20 percentage points in a few-shot setup. Straightforward Chain-of-Thought (CoT) methods, used as the baseline, prove insufficient, while our approach consistently enhances domain transfer performance.
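One classical way to force a genre classifier to attend to style rather than topic — in the same spirit as the indicator control described above, though not the paper's LLM-prompting method — is to represent texts by function-word frequencies, discarding content words entirely. A minimal, hypothetical sketch (the word list is illustrative):

```python
# A small illustrative set of English function words; real stylometric
# feature sets are much larger.
FUNCTION_WORDS = ["the", "of", "and", "to", "in", "that", "is", "was",
                  "it", "for", "on", "with", "as", "but", "not", "this"]

def style_vector(text):
    """Represent a text by relative frequencies of common function words,
    discarding topical content so a classifier cannot latch onto it.
    """
    tokens = text.lower().split()
    n = max(len(tokens), 1)
    return [tokens.count(w) / n for w in FUNCTION_WORDS]
```

Because the vector contains no content words, two texts about travel and history with the same register map to similar vectors, which is the kind of domain invariance the OOD-gap control is after.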
- Europe > United Kingdom (0.14)
- North America > Jamaica (0.04)
- Asia > Singapore (0.04)
- (9 more...)
- Media (0.68)
- Leisure & Entertainment (0.68)
- Health & Medicine > Therapeutic Area (0.68)
- Information Technology > Security & Privacy (0.46)
THESAURUS: Contrastive Graph Clustering by Swapping Fused Gromov-Wasserstein Couplings
Deng, Bowen, Wang, Tong, Fu, Lele, Huang, Sheng, Chen, Chuan, Zhang, Tao
Graph node clustering is a fundamental unsupervised task. Existing methods typically train an encoder through self-supervised learning and then apply K-means to the encoder output. Some methods use this clustering result directly as the final assignment, while others initialize centroids based on this initial clustering and then fine-tune both the encoder and these learnable centroids. However, due to their reliance on K-means, these methods inherit its drawbacks when the cluster separability of the encoder output is low, facing challenges from the Uniform Effect and Cluster Assimilation. We summarize three reasons for the low cluster separability in existing methods: (1) lack of contextual information prevents discrimination between similar nodes from different clusters; (2) training tasks are not sufficiently aligned with the downstream clustering task; (3) the cluster information in the graph structure is not appropriately exploited. To address these issues, we propose conTrastive grapH clustEring by SwApping fUsed gRomov-wasserstein coUplingS (THESAURUS). Our method introduces semantic prototypes to provide contextual information, and employs a cross-view assignment prediction pretext task that aligns well with the downstream clustering task. Additionally, it utilizes Gromov-Wasserstein Optimal Transport (GW-OT) along with the proposed prototype graph to thoroughly exploit cluster information in the graph structure. To adapt to diverse real-world data, THESAURUS updates the prototype graph and the prototype marginal distribution in OT by using momentum. Extensive experiments demonstrate that THESAURUS achieves higher cluster separability than the prior art, effectively mitigating the Uniform Effect and Cluster Assimilation issues.
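The K-means step that the abstract says existing pipelines apply to the encoder output is simple to state in NumPy; the sketch below shows that baseline step (Lloyd's algorithm with random initialization), not THESAURUS itself:

```python
import numpy as np

def kmeans(Z, k, iters=50, seed=0):
    """Plain K-means (Lloyd's algorithm) on encoder embeddings Z (n, d):
    the step that self-supervised clustering pipelines run on the learned
    representation. Inherits K-means' sensitivity to low separability.
    """
    rng = np.random.default_rng(seed)
    centroids = Z[rng.choice(len(Z), size=k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        dists = np.linalg.norm(Z[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its assigned points.
        for c in range(k):
            if np.any(labels == c):
                centroids[c] = Z[labels == c].mean(axis=0)
    return labels, centroids
```

When embeddings from different clusters overlap (low separability), the nearest-centroid assignment above merges or evenly splits them — precisely the Uniform Effect and Cluster Assimilation failure modes the paper targets.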
- North America > United States > New York > New York County > New York City (0.14)
- Asia > China > Guangdong Province > Guangzhou (0.04)
- North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
- (3 more...)
Siamese Machine Unlearning with Knowledge Vaporization and Concentration
Xie, Songjie, He, Hengtao, Song, Shenghui, Zhang, Jun, Letaief, Khaled B.
In response to the practical demands of the "right to be forgotten" and the removal of undesired data, machine unlearning emerges as an essential technique to remove the learned knowledge of a fraction of data points from trained models. However, existing methods suffer from limitations such as insufficient methodological support, high computational complexity, and significant memory demands. In this work, we propose the concepts of knowledge vaporization and concentration to selectively erase learned knowledge from specific data points while maintaining representations for the remaining data. Utilizing Siamese networks, we exemplify the proposed concepts and develop an efficient method for machine unlearning. Our proposed Siamese unlearning method requires neither additional memory overhead nor full access to the remaining dataset. Extensive experiments conducted across multiple unlearning scenarios showcase the superiority of Siamese unlearning over baseline methods, illustrating its ability to effectively remove knowledge of the forgetting data, enhance model utility on the remaining data, and reduce susceptibility to membership inference attacks.
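As a rough intuition for "vaporization and concentration" — and only that; this is a conceptual contrastive-style sketch, not the paper's actual loss — one can write an objective that pulls retained pairs of embeddings together while pushing forgotten pairs apart up to a margin:

```python
import numpy as np

def unlearning_loss(anchor, positive, forget=False, margin=5.0):
    """Conceptual contrastive-style objective (hypothetical, for
    illustration): 'concentrate' knowledge by attracting retained pairs,
    'vaporize' it by repelling forgotten pairs up to a margin.
    """
    d = np.linalg.norm(anchor - positive)
    if forget:
        return max(0.0, margin - d) ** 2  # vaporize: repel up to margin
    return d ** 2                          # concentrate: attract
```

Minimizing this over forget-set pairs degrades the model's representation of the forgotten points, while the retain-set term preserves utility on the remaining data — the trade-off the experiments in the paper evaluate.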
- Europe > Netherlands > North Holland > Amsterdam (0.04)
- Asia > China > Hong Kong (0.04)
- Law (1.00)
- Information Technology > Security & Privacy (1.00)